SUMMARY

In this note we wish to indicate some ways in which the
theory of approximation can be used to increase the range of
present day computers. Although we are primarily interested
in applying these techniques to the functional equations
occurring in the theory of dynamic programming, it should be
noted that these same methods are applicable, and even more
readily, to the classical functional equations of mathematical
physics.

What we wish to do is to trade additional computing time,
which is expensive, for additional memory capacity, which
does not exist.
FUNCTIONAL APPROXIMATIONS AND DYNAMIC PROGRAMMING

Richard  Bellman 
Stuart  Dreyfus 


1.  Introduction 

In this note we wish to indicate some ways in which the
theory of approximation can be used to increase the range of
present day computers. Although we are primarily interested
in applying these techniques to the functional equations
occurring in the theory of dynamic programming, [1], it
should be noted that these same methods are applicable, and
even more readily, to the classical functional equations of
mathematical physics.

What we wish to do is to trade additional computing time,
which is expensive, for additional memory capacity, which
does not exist.

2.  Dimensionality  Difficulties 

A typical problem arising in the theory of control processes
is that of maximizing a functional of the form

(1)  $J(y) = \int_0^T g(x, y)\,dt$,

where x and y are N-dimensional vectors related by the
differential equation

(2)  $\frac{dx}{dt} = h(x, y), \quad x(0) = c$.

As we have discussed elsewhere at some length, questions of
this nature, although nominally within the domain of the
calculus of variations, in actuality cannot be reduced to the
point of numerical solution by means of classical techniques.

Writing

(3)  $\max_y J(y) = f(c, T)$,

the theory of dynamic programming replaces the foregoing
variational problem by that of solving the nonlinear partial
differential equation

(4)  $\frac{\partial f}{\partial T} = \max_v \left[ g(c, v) + \left( h(c, v), \frac{\partial f}{\partial c} \right) \right]$,

     $f(c, 0) = 0$,


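where

(5)  $\frac{\partial f}{\partial c} = \left( \frac{\partial f}{\partial c_1}, \ldots, \frac{\partial f}{\partial c_N} \right)$

and $(a, b)$ denotes the inner product of the vectors a and b.
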
For computational purposes, it is often convenient to use the
approximate difference equation

(6)  $f(c, T + \Delta) = \max_v \left[ g(c, v)\Delta + f(c + h(c, v)\Delta, T) \right]$.

In terms of the capacities of modern computers, we have an
extremely efficient algorithm if N = 1, a scalar problem,
and a feasible algorithm if N = 2. If N = 3 or more, we
face fast memory difficulties if we attempt to proceed in a
routine fashion.

The reason for this is the following. To store a function
of N variables in the usual way, we tabulate the values of
the function at a set of lattice points within the domain of
interest. If there are M different possible values of $c_1$,
of $c_2$, and so on, the total number of grid points will be
$M^N$. For M = 100 and N = 3, this yields a quantity outside
of present capabilities.
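
The growth of the table with the dimension can be checked by
direct arithmetic. The following is a minimal sketch in Python;
the function name grid_points is introduced here purely for
illustration.

```python
def grid_points(m: int, n: int) -> int:
    """Number of lattice points when each of n state variables takes m values."""
    return m ** n


# Storage needed for M = 100 values per variable, N = 1, 2, 3 variables.
for n in (1, 2, 3):
    print(n, grid_points(100, n))
```

Each additional state variable multiplies the storage by M, which
is the source of the difficulty for N = 3 or more.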

In various papers, [4], [5], we have indicated some
methods which enable us to circumvent these difficulties.
These methods combine analytic devices with the method of
successive approximation. In this paper, we wish to present a
new method, based upon approximation techniques, which appears
to have wide applicability.


3. One-dimensional Case

In order to illustrate the application of the method in
its simplest form, let us consider the problem of determining
the sequence of functions $\{f_n(c)\}$, n = 1, 2, ..., given by
the recurrence relation

(1)  $f_1(c) = \max_v g(c, v)$,

     $f_{n+1}(c) = \max_v \left[ g(c, v) + f_n(h(c, v)) \right]$.

Let us suppose that c takes values only over [-1, 1] and
that the function h(c, v) similarly assumes values over this
interval for all c-values and all permissible v-values.

The standard approach involves a grid of values in the
interval [-1, 1], where the number of grid-points depends
upon the accuracy that we desire. Let us proceed in a
different manner. In place of considering that the function
is determined by the set of grid-points, we shall consider
the function to be determined by a Fourier expansion in terms
of a suitable orthonormal set. For the interval [-1, 1],
a convenient set is the set of normalized Legendre polynomials.
Thus, for some fixed value R, we write

(2)  $f_n(c) \approx \sum_{k=0}^{R} a_{kn} P_k(c)$.

This is, of course, an approximation comparable with that of
using a finite number of grid-points.

We could envisage using a power series expansion rather
than an expansion in terms of orthogonal functions. Generally
speaking, an orthogonal expansion is to be preferred, both on
the grounds of accuracy and the grounds of ease of determination
of the coefficients. As far as the calculation of the
values of $P_k(c)$ is concerned, the simple three-term
recurrence relations connecting the successive members of the
sequence of Legendre polynomials make the computation of these
values not much more difficult than that of the calculation of
the powers $c^k$.
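
The three-term recurrence in question is
$(k+1) P_{k+1}(c) = (2k+1)\,c\,P_k(c) - k\,P_{k-1}(c)$, with
$P_0 = 1$ and $P_1 = c$. A minimal sketch of its use follows. We
assume here that "normalized" means scaling $P_k$ by
$\sqrt{(2k+1)/2}$, so that each member has unit square integral
over [-1, 1]; the function name is ours.

```python
import math


def normalized_legendre(c: float, r: int) -> list[float]:
    """Return the normalized Legendre values of degree 0..r at the point c."""
    p = [1.0, c]  # P_0(c) and P_1(c)
    for k in range(1, r):
        # (k + 1) P_{k+1} = (2k + 1) c P_k - k P_{k-1}
        p.append(((2 * k + 1) * c * p[k] - k * p[k - 1]) / (k + 1))
    # Assumed normalization: integral of the square over [-1, 1] equals 1.
    return [math.sqrt((2 * k + 1) / 2) * p[k] for k in range(r + 1)]


print(normalized_legendre(0.5, 3))
```

Each new value costs a fixed handful of multiplications, which is
comparable to forming the next power of c.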

The function $f_n(c)$ is replaced, as far as storage
in the computer is concerned, by the set of coefficients

(3)  $A_n = [a_{0n}, a_{1n}, \ldots, a_{Rn}]$.

Whenever we wish a value of $f_n(c)$, we compute it by means of
the approximate relation in (2). Consequently, in computing
the values of $f_{n+1}(c)$ by means of (1), we require only the
storage of the set of values given by $A_n$.

There is, however, a difficulty in this approach. The
coefficients $a_{kn}$ are determined by the relations

(4)  $a_{kn} = \int_{-1}^{1} f_n(c) P_k(c)\,dc$.

How are we going to calculate them? If we replace the integral
by a simple Riemann sum, we are led back to the necessity for
tabulating the values of $f_n(c)$ at a set of grid-points.

It is here that we invoke the theory of mechanical quadrature.
In place of evaluating the integrals in (4) by means
of Riemann sums, we use an interpolation formula of the
following form:

(5)  $\int_{-1}^{1} f_n(c) P_k(c)\,dc \approx \sum_{j=1}^{S} a_j f_n(c_j) P_k(c_j)$,

where the $a_j$ and $c_j$ are carefully chosen.

If the quantities $c_j$ are chosen to be the zeros of the
Legendre polynomial of degree S, and the coefficients $a_j$
are chosen to be the Christoffel numbers, the formula in (5)
is exact if the integrand $f_n(c) P_k(c)$ is a polynomial of
degree at most 2S - 1.
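
The exactness property is easy to observe numerically. The sketch
below relies on NumPy's leggauss routine, which returns the zeros
of the Legendre polynomial of degree S together with the
corresponding weights; the helper name and the choice S = 3 are
ours.

```python
import numpy as np


def gauss_legendre_integral(func, s):
    """Approximate the integral of func over [-1, 1] with an s-point rule."""
    nodes, weights = np.polynomial.legendre.leggauss(s)
    return float(np.sum(weights * func(nodes)))


S = 3
# Degree 4 does not exceed 2S - 1 = 5, so the rule should reproduce 2/5.
exact_case = gauss_legendre_integral(lambda c: c**4, S)
# Degree 6 exceeds 2S - 1, so the rule need not reproduce 2/7.
inexact_case = gauss_legendre_integral(lambda c: c**6, S)

print(exact_case, 2 / 5)
print(inexact_case, 2 / 7)
```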
Since the quantities $a_j$ and $c_j$ are tabulated, and in
any case could easily be determined in advance, the amount
of effort entailed depends upon the choice of S. Since our
original approximation, contained in (2), is equivalent to
that of assuming that $f_n(c)$ is a polynomial of degree R,
it would be reasonable to take S equal to R. However,
there is no necessity for doing this.

If we use the equation in (5) to compute the values of
the coefficient set $A_n$, it is clear that at each stage we
require only the values $f_n(c_j)$, j = 1, 2, ..., R.

The computation then proceeds in the following fashion.
Given the values of $f_n(c_j)$, we compute the coefficient set
$A_n$. Using these coefficients we can determine the values of
$f_n(h(c, v))$ which occur in (1) in the course of computing the
new values $f_{n+1}(c_j)$.
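
The cycle just described can be sketched as follows. This is an
illustration under stated assumptions rather than the program
actually used: the functions g and h, the finite set of trial
controls, and the values R = 6, S = 8 are hypothetical, and the
maximization over v is done by simple enumeration.

```python
import numpy as np
from numpy.polynomial import legendre as leg


def norms(r):
    """Factors that make P_0..P_r orthonormal on [-1, 1]."""
    return np.sqrt((2 * np.arange(r + 1) + 1) / 2)


def fit_coefficients(values, nodes, weights, r):
    """Quadrature form of (4)-(5): a_k = sum_j w_j f(c_j) P_k(c_j)."""
    basis = leg.legvander(nodes, r) * norms(r)  # basis[j, k]: normalized P_k(c_j)
    return basis.T @ (weights * values)


def evaluate(coeffs, c):
    """Expansion (2): f(c) is approximated by sum_k a_k P_k(c)."""
    return leg.legval(c, coeffs * norms(len(coeffs) - 1))


def dp_step(coeffs, nodes, g, h, controls):
    """New values f_{n+1}(c_j) = max_v [g(c_j, v) + f_n(h(c_j, v))]."""
    new_values = np.empty(len(nodes))
    for j, c in enumerate(nodes):
        new_values[j] = np.max(g(c, controls) + evaluate(coeffs, h(c, controls)))
    return new_values


# Hypothetical problem data, chosen only so that h maps [-1, 1] into itself.
g = lambda c, v: -(c**2) - v**2
h = lambda c, v: np.clip(0.5 * c + v, -1.0, 1.0)

R, S = 6, 8
nodes, weights = leg.leggauss(S)           # the c_j and the Christoffel numbers
controls = np.linspace(-1.0, 1.0, 41)      # trial values of v

values = np.array([np.max(g(c, controls)) for c in nodes])  # f_1(c_j)
for _ in range(5):
    coeffs = fit_coefficients(values, nodes, weights, R)     # the set A_n
    values = dp_step(coeffs, nodes, g, h, controls)          # f_{n+1}(c_j)
```

Only the R + 1 coefficients and the S node values are carried from
one stage to the next; no table of $f_n$ over a grid is kept.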

4.  Discussion 

Apart from the fixed set of instructions, and the values
such as $a_j$ and $c_j$ which are determined at the beginning of
the process, we require the set $A_n = [a_{0n}, \ldots, a_{Rn}]$
to be retained in the fast memory at the n-th stage. This is
a set of (R + 1) values.

On the other hand, a grid size of $\delta$ over [-1, 1]
would require $1/\delta$ values. In one dimension, the difference
between R + 1 and $1/\delta$ is not particularly important, and
the great amount of additional computation required by the
method described above can more than outweigh this advantage.


P-1176
Revised 4-28-59


Consider, however, the two-dimensional case. The straightforward
approach based upon a grid size of $\delta$ in the $c_1$ and
$c_2$ intervals requires $(1/\delta)^2$ values of $f_n(c_1, c_2)$.
On the other hand, if we set

(1)  $f_n(c_1, c_2) \approx \sum_{i+j \le R} a_{ij,n} P_i(c_1) P_j(c_2)$,

and proceed as above, we require only (R + 1)(R + 2)/2 values, the
coefficients $a_{ij,n}$, i, j = 0, 1, ..., R, with $i + j \le R$.

Proceeding to three dimensions, we compare $(1/\delta)^3$ and
(R + 1)(R + 2)(R + 3)/6, the number of coefficients. Let
us use some typical values of $\delta$ and R and compare the
values, say $\delta$ = .01 and R = 5, 10.

Dimension    $(1/\delta)^N$    R = 5    R = 10

    1            100               6        11
    2            $10^4$           21        66
    3            $10^6$           56       286
    4            $10^8$          126      1001
    5            $10^{10}$       252      3003

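
The entries of such a comparison can be recomputed directly. The
sketch below assumes that the expansion retains every product of
Legendre polynomials whose total degree is at most R, which gives
the binomial coefficient C(R + N, N) in N dimensions; the function
names are ours.

```python
from math import comb


def grid_values(delta: float, dim: int) -> int:
    """Table size for a grid of spacing delta in each of dim variables."""
    return round(1 / delta) ** dim


def coefficient_count(r: int, dim: int) -> int:
    """Number of index tuples (k_1, ..., k_dim) with k_1 + ... + k_dim <= r."""
    return comb(r + dim, dim)


for dim in range(1, 6):
    print(dim, grid_values(0.01, dim),
          coefficient_count(5, dim), coefficient_count(10, dim))
```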

We  see  that  variational  problems  involving  four  and  five 
state  variables  which  are  completely  untouchable  by  direct 
methods  are  within  the  scope  of  the  method  we  have  outlined. 
Combining  this  method  with  the  Lagrange  multiplier  technique, 
and  the  method  of  successive  approximations,  we  have  a  way  of 
attacking  previously  impregnable  problems. 
5. A Numerical Example

Consider the problem of determining the sequence $\{v_n\}$
so as to minimize the function

(1)  $F_N = \sum_{k=1}^{N} u_k^2 + \sum_{k=1}^{N} v_k^2$,

where

(2)  (a) $u_{n+1} = 2u_n - u_n^2 + v_n$, $u_0 = c$,

     (b) $v_n$ must be chosen so as to keep $u_{n+1}$ within the
         interval [-1, 1].

Introducing, for $N \ge 1$ and $-1 \le c \le 1$, the function

(3)  $f_N(c) = \min F_N$,

we derive the recurrence relation

(4)  $f_N(c) = \min_v \left[ c^2 + v^2 + f_{N-1}(2c - c^2 + v) \right]$, $N \ge 2$,

     $f_1(c) = c^2$.


This yields a simple computational determination of the
sequence $\{f_N(c)\}$.
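
A grid-based determination of this kind can be sketched as
follows. We assume here that the recurrence reads
$f_N(c) = \min_v [c^2 + v^2 + f_{N-1}(2c - c^2 + v)]$ with
$f_1(c) = c^2$; the grid of 201 points and the device of letting v
carry each grid point exactly onto another grid point are our own
choices, not necessarily those of the original computation.

```python
import numpy as np


def grid_sequence(n_stages, num_points=201):
    """Tabulate f_N of the assumed recurrence on a uniform grid over [-1, 1]."""
    c = np.linspace(-1.0, 1.0, num_points)
    drift = 2 * c - c**2                  # next state when v = 0
    f = c**2                              # f_1(c) = c^2
    for _ in range(n_stages - 1):
        # v[i, j] moves grid point c[i] exactly onto grid point c[j], so the
        # next state stays in [-1, 1] and no interpolation is needed.
        v = c[None, :] - drift[:, None]
        f = np.min(c[:, None] ** 2 + v**2 + f[None, :], axis=1)
    return c, f


c, f6 = grid_sequence(6)
for target in (1.0, 0.8, 0.2, 0.0, -0.2, -0.8, -1.0):
    i = int(np.argmin(np.abs(c - target)))
    print(f"{c[i]:5.2f}  {f6[i]:.3f}")
```

The table held in memory has one entry per grid point, which is
the storage requirement the polynomial method is meant to avoid.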

As a test of the foregoing method, this sequence was
first determined by the usual method based upon a grid of
values over [-1, 1], and then following the procedure
described above, using the approximation




(5)  $f_N(c) \approx \sum_{k=0}^{R} a_{kN} P_k(c)$

and a value of S = 14 in (3.5).

Let $\{f_k(c)\}$ denote the sequence determined using the
grid. A comparison of its values with those of the sequence
obtained via Legendre polynomials is given below for k = 6.


   c      $f_6(c)$, grid    $f_6(c)$, Legendre

  1.0        1.782              1.77
   .8        1.370              1.36
   .2         .153               .145
  0.0         .006              0.0
  -.2         .202               .20
  -.8        4.876              4.89
 -1.0        8.666              8.67


As can be seen, the agreement is quite good.